Branch Prediction, Instruction-Window Size, and Cache Size: Performance Trade-Offs and Simulation Techniques

Authors

  • Kevin Skadron
  • Pritpal S. Ahuja
  • Margaret Martonosi
  • Douglas W. Clark
Abstract

Design parameters interact in complex ways in modern processors, especially because out-of-order issue and decoupling buffers allow latencies to be overlapped. Tradeoffs among instruction-window size, branch-prediction accuracy, and instruction- and data-cache size can change as these parameters move through different domains. For example, modeling unrealistic caches can under- or over-state the benefits of better prediction or a larger instruction window. Avoiding such pitfalls requires understanding how all these parameters interact. Because such methodological mistakes are common, this paper provides a comprehensive set of SimpleScalar simulation results from SPECint95 programs, showing the interactions among these major structures. In addition to presenting this database of simulation results, major mechanisms driving the observed tradeoffs are described. The paper also considers appropriate simulation techniques when sampling full-length runs with the SPEC reference inputs. In particular, the results show that branch mispredictions limit the benefits of larger instruction windows, that better branch prediction and better instruction cache behavior have synergistic effects, and that the benefits of larger instruction windows and larger data caches trade off and have overlapping effects. In addition, simulations of only 50 million instructions can yield representative results if these short windows are carefully selected.

Related resources

Characterizing and Removing Branch Mispredictions

Control-flow mispredictions are a profound impediment to processor performance, because each misprediction introduces a pipeline bubble of many cycles’ duration. For example, the minimum bubble in the recently released Alpha 21264 is at least seven cycles, and often as much as twenty cycles. With such long penalties, even small misprediction rates harm performance substantially. Although a huge...

Instruction prefetching using branch prediction information

Instruction prefetching can effectively reduce instruction cache misses, thus improving the performance. In this paper, we propose a prefetching scheme, which employs a branch predictor to run ahead of the execution unit and to prefetch potentially useful instructions. Branch prediction based (BP-based) prefetching has a separate small fetching unit, allowing it to compute and predict targets a...

Understanding the Backward Slices of Performance Degrading Instructions

For many applications, branch mispredictions and cache misses limit a processor's performance to a level well below its peak instruction throughput. A small fraction of static instructions, whose behavior cannot be anticipated using current branch predictors and caches, contribute a large fraction of such performance degrading events. This paper analyzes the dynamic instruction stream leading u...

Multiple Branch Prediction for Wide-Issue Superscalar

Modern micro-architectures employ superscalar techniques to enhance system performance. Because superscalar microprocessors must fetch at least one instruction cache line at a time to support a high issue rate and a large amount of speculative execution, multiple branches are often encountered in one cycle. In practical implementations this would cause a serious problem while ...

Energy-delay efficient filter cache hierarchy using pattern prediction scheme - Computers and Digital Techniques, IEE Proceedings

Filter cache (FC) is an auxiliary cache much smaller than the main cache. The FC is closest in hierarchy to the instruction fetch unit and it must be small in size to achieve energy-efficient realisations. A pattern prediction scheme is adapted to maximise energy savings in the FC hierarchy. The pattern prediction mechanism proposed relies on the spatial hit or miss pattern of the instruction ac...

Journal:
  • IEEE Trans. Computers

Volume 48

Publication date: 1999